The MSRA machine translation system for IWSLT 2010

Authors

  • Chi-Ho Li
  • Nan Duan
  • Yinggong Zhao
  • Shujie Liu
  • Lei Cui
  • Mei-Yuh Hwang
  • Amittai Axelrod
  • Jianfeng Gao
  • Yaodong Zhang
  • Li Deng
Abstract

This paper describes the systems built and the experiments run by Microsoft Research Asia (MSRA), with the support of Microsoft Research (MSR), for the IWSLT 2010 evaluation campaign. We participated in all tracks of the DIALOG task (Chinese/English). While we follow the general training and decoding routine of statistical machine translation (SMT) and that of MT output combination, this is the first time we have tried our ideas for post-processing the output of automatic speech recognition (ASR) before feeding it to SMT decoders. Our findings are: (1) it does not help to use the complete N-best ASR output; rather, the best translation performance is achieved by taking the top candidate after Minimum Bayes Risk re-ranking of the N-best ASR output; (2) for punctuation recovery, the best performance is achieved by splitting the problem into two steps, namely the prediction of punctuation position and the prediction of the punctuation mark given a position.
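To make finding (1) concrete, here is a minimal sketch of Minimum Bayes Risk re-ranking over an N-best list. The abstract does not say which loss function or posterior estimates the authors used, so this sketch assumes a word-level edit distance as the loss and takes hypothesis weights as given. The names `edit_distance`, `mbr_rerank`, and `example_nbest` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mbr_rerank(nbest: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort hypotheses by expected loss (lowest first).

    nbest holds (hypothesis, weight) pairs. Weights are assumed to be
    non-negative and are normalised here into a posterior distribution.
    The loss is an assumption: word-level edit distance to every other
    hypothesis, weighted by that hypothesis's posterior.
    """
    total = sum(weight for _, weight in nbest)
    if total <= 0:
        raise ValueError("N-best weights must sum to a positive value")
    tokens = [hyp.split() for hyp, _ in nbest]
    probs = [weight / total for _, weight in nbest]
    scored = []
    for i, (hyp, _) in enumerate(nbest):
        risk = sum(p * edit_distance(tokens[i], other)
                   for other, p in zip(tokens, probs))
        scored.append((hyp, risk))
    return sorted(scored, key=lambda pair: pair[1])


# Toy N-best list (invented for illustration, not data from the paper).
example_nbest = [
    ("i want a ticket to boston", 0.4),
    ("i want a ticket to austin", 0.3),
    ("i want the ticket to austin", 0.3),
]
best_hypothesis, best_risk = mbr_rerank(example_nbest)[0]
print(best_hypothesis)  # i want a ticket to austin
```

In this toy list the hypothesis with the highest weight is not the one selected: the second hypothesis is closest on average to the rest of the list, so it has the lowest expected loss. Passing only that top candidate to the SMT decoder corresponds to the setup the abstract reports as working best.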

Similar resources

I2R's machine translation system for IWSLT 2010

In this paper, we describe the system and approach used by the Institute for Infocomm Research (I2R) for the IWSLT 2010 spoken language translation evaluation campaign. We apply system combination on top of two kinds of statistical machine translation systems, namely a phrase-based system and a syntax-based system. Experimental results show consistent improvements on the DIALOG task.

AppTek's APT machine translation system for IWSLT 2010

In this paper, we describe AppTek’s new APT machine translation system that we employed in the IWSLT 2010 evaluation campaign. This year, we participated in the Arabic-to-English and Turkish-to-English BTEC tasks. We discuss the architecture of the system, the preprocessing steps and the experiments carried out during the campaign. We show that competitive translation quality can be obtained wit...

LIUM's statistical machine translation system for IWSLT 2010

This paper describes the two systems developed by the LIUM laboratory for the 2010 IWSLT evaluation campaign. We participated in the new English-to-French TALK task. We developed two systems, one for each evaluation condition, both being statistical phrase-based systems using the Moses toolkit. Several approaches were investigated.

The POSTECH's statistical machine translation system for the IWSLT 2010

In this paper, we utilize segmentation alternatives. Our research contribution is a novel estimation method of the translation probabilities used in phrase-based statistical machine translation in order to reflect the trustworthiness of the segmentation. Our system, however, underperforms the baseline.

The KIT translation system for IWSLT 2010

This paper presents the KIT systems participating in the French-to-English BTEC and the English-to-French TALK translation tasks in the framework of the IWSLT 2010 machine translation evaluation. Starting with a state-of-the-art phrase-based translation system, we tested different modifications and extensions to improve the translation quality of the system. First, we improved the word reorde...

Publication date: 2010